Logistic regression in feature selection in data mining
Abstract
Predictive data mining in clinical medicine is concerned with learning models that predict patients' health. Such models can support clinicians in diagnostic, therapeutic, or monitoring tasks. In clinical contexts, data mining methods are usually applied to retrospective data, giving healthcare professionals the opportunity to exploit the large amounts of data routinely collected during their day-to-day activity. Moreover, clinicians can now use data mining techniques to handle the large volume of research results produced by molecular medicine, such as genetic or genomic signatures, which may allow a transition from population-based to personalized medicine. This paper aims to shed light on the oldest feature extraction method, namely logistic regression (LR). LR is useful in situations where we want to predict the presence or absence of a characteristic or outcome from the values of a set of predictor variables. LR was used for heart attack classification in coronary heart disease (CHD). LR is used for binary classification.
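As a minimal sketch of the idea in the abstract, the following pure-Python example fits a binary logistic regression by gradient descent and ranks predictors by the absolute size of their coefficients. Everything here is an illustrative assumption rather than the paper's procedure: the data are synthetic (not CHD records), the function names `fit_logistic` and `predict` are hypothetical, and ranking by coefficient magnitude is only one simple heuristic that presumes the predictors are on comparable scales.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p(y=1 | x) - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # Presence (1) or absence (0) of the outcome at a 0.5 threshold.
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return 1 if p >= 0.5 else 0


# Synthetic data (an assumption for illustration): predictor 0 drives the
# outcome, predictor 1 is pure noise. Both have unit variance.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([x0, x1])
    y.append(1 if x0 + 0.3 * rng.gauss(0, 1) > 0 else 0)

w, b = fit_logistic(X, y)

# Rank predictors from most to least influential by |coefficient|.
ranking = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)

print("coefficients:", w)
print("predictor ranking:", ranking)
print("training accuracy:", accuracy)
```

In practice the predictors would usually be standardized first, and a held-out set rather than training accuracy would be used to judge the model.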
Similar resources
Classification and Comparative Study of Data Mining Classifiers with Feature Selection on Binomial Data Set
This paper describes the performance analysis of different data mining classifiers before and after feature selection on a binomial data set. Three data mining classifiers, Logistic Regression, SVM, and Neural Network, are considered for classification. The Congressional Voting Records data set, the binomial data set investigated in this study, is taken from UCI machin...
Towards Structural Logistic Regression: Combining Relational and Statistical Learning
Inductive logic programming (ILP) techniques are useful for analyzing data in multi-table relational databases. Learned rules can potentially discover relationships that are not obvious in "flattened" data. Statistical learners, on the other hand, are generally not constructed to search relational data; they expect to be presented with a single table containing a set of feature candidates. Howe...
Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem
Applying data mining methods as a decision support system is of great benefit in predicting the survival of new patients. It also has great potential for health researchers investigating the relationship between risk factors and cancer survival. But due to the imbalanced nature of datasets associated with breast cancer survival, the accuracy of survival prognosis models is a challenging issue...
Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines
In this paper, the principles of feature selection and existing feature selection methods for classifying and clustering data are introduced. To that end, categorizing frameworks for finding selected subsets, namely search-based and non-search-based procedures, as well as evaluation criteria and data mining tasks, are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...
Multi-label Classification using Logistic Regression Models for NTCIR-7 Patent Mining Task
We design a multi-label classification system based on a machine learning approach for the NTCIR-7 Patent Mining Task. In our system, we employ a logistic regression model for each International Patent Classification (IPC) code that determines the IPC code assignment of research papers. The logistic regression models are trained by using patent documents provided by task organizers. To mitigate...